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ABSTRACT 

N-Body simulations are a very important tool in the study of formation of large scale struc- 
tures. Much of the progress in understanding the physics of galaxy formation and comparison 
with observations would not have been possible without N-Body simulations. Given the im- 
portance of this tool, it is essential to understand its limitations as ignoring these can easily 
lead to interesting but unreliable results. In this paper we study the limitations due to the finite 
size of the simulation volume. We explicitly construct the correction term arising due to a fi- 
nite box size and study its generic features for clustering of matter and also on mass functions. 
We show that the correction to mass function is maximum near the scale of non-linearity, as a 
corollary we show that the correction to the number density of haloes of a given mass changes 
sign at this scale; the number of haloes at small masses is over estimated in simulations. This 
over estimate results from a delay in mergers that lead to formation of more massive haloes. 
The same technique can be used to study corrections to other physical quantities. The cor- 
rections are typically small if the scale of non-linearity is much smaller than the box-size. 
However, there are some cases of physical interest in which the relative correction term is of 
order unity even though a simulation box much larger than the scale of non-linearity is used. 
Within the context of the concordance model, our analysis suggests that it is very difficult for 
present day simulations to resolve mass scales smaller than 10 2 M Q accurately and the level 
of difficulty increases as we go to even smaller masses, though this constraint does not apply 
to multi-scale simulations. 

Key words: methods: N-Body simulations, numerical - gravitation - cosmology : theory, 
dark matter, large scale structure of the universe 



1 INTRODUCTION 

Large scale structures like galaxies and clusters of galaxies are be- 
lieved to have formed by gravitational amplification of small pertur- 
bations. For an overview and original references, see, e.g., Peebles 
(1980); Peacock (1999); Padmanabhan (2002); Bernardeau et al. 
(2002). Initial density perturbations were present at all scales that 
have been observed (Spergel et al. 2003; Hawkins et al. 2003; 
Pope et al. 2004). Understanding evolution of density perturba- 
tions for such initial conditions is essential for the study of for- 
mation of galaxies and large scale structures. The equations that 
describe the evolution of density perturbations in an expanding uni- 
verse have been known for a long time (Peebles 1974) and these 
are easy to solve when the amplitude of perturbations is small. 
These equations describe the evolution of density contrast defined 
as S(r, i) = (p(r,t) — p(t))/p(t). Here p(r,t) is the density at 
point r and time t, and p is the average density in the universe at 
time t. These are densities of non-relativistic matter, the component 
that clusters at all scales and is believed to drive the formation of 
large scale structures in the universe. Once density contrast at rel- 
evant scales becomes large, i.e., \8\ > 1, the perturbation becomes 



non-linear and coupling with perturbations at other scales cannot be 
ignored. The equation for evolution of density perturbations cannot 
be solved for generic perturbations in this regime. N-Body simu- 
lations (Bertschinger 1998; Bagla and Padmanabhan 1997b; Bagla 
2005) are often used to study the evolution in this regime. Alterna- 
tive approaches can be used if one requires only a limited amount 
of information and in such a case either quasi-linear approximation 
schemes (Zel'dovich 1970; , 1989; Matarrese et al. 1992; Brain- 
erd et al. 1993; Bagla & Padmanabhan 1994; Sahni & Coles 1995; 
Hui & Bertschinger 1996; Bernardeau et al. 2002) or scaling re- 
lations (Davis & Peebles 1977; Hamilton et al. 1991; Iain et al. 
1995; Kanekar 2000; Ma 1998; Nityananda & Padmanabhan 1994; 
Padmanabhan et al. 1996; Peacock & Dodds 1994; Padmanabhan 
1996; Peacock & Dodds 1996; Smith et al. 2003) suffice. 

In cosmological N-Body simulations, we simulate a represen- 
tative region of the universe. This is a large but finite volume and 
periodic boundary conditions are often used. Almost always, the 
simulation volume is taken to be a cube. Effect of perturbations at 
scales smaller than the mass resolution of the simulation, and of 
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perturbations at scales larger than the box is ignored. Indeed, even 
perturbations at scales comparable to the box are under sampled. 

It has been shown that perturbations at small scales do not 
influence collapse of perturbations at much larger scales in a sig- 
nificant manner (Peebles 1974, 1985; Little et al. 1991; Bagla & 
Padmanabhan 1997a; Couchman & Peebles 1998) if we study the 
evolution of the correlation function or power spectrum at large 
scales due to gravitational clustering in an expanding universe. This 
is certainly true if the scales of interest are in the non-linear regime 
(Bagla & Padmanabhan 1997a). Therefore we may assume that ig- 
noring perturbations at scales much smaller than the scales of inter- 
est does not affect results of N-Body simulations. However, there 
may be other effects that are not completely understood at the quan- 
titative level (Bagla, Prasad & Ray 2005) even though these have 
been seen only in somewhat artificial situations. 

Perturbations at scales larger than the simulation volume can 
affect the results of N-Body simulations. Use of periodic boundary 
conditions implies that the average density in the simulation box 
is same as the average density in the universe, in other words we 
ignore perturbations at the scale of the simulation volume (and at 
larger scales). Therefore the size of the simulation volume should 
be chosen so that the amplitude of fluctuations at the box scale 
(and at larger scales) is ignorable. If the amplitude of perturbations 
at larger scales is not ignorable then clearly the simulation will not 
be a faithful representation of the model being studied. It is not 
obvious as to when fluctuations at larger scales can be considered 
ignorable, indeed the answer to this question depends on the phys- 
ical quantity of interest, the model being studied and the specific 
length/mass scale of interest as well. 

The effect of a finite box size has been studied using N-Body 
simulations and the conclusions in this regard may be summarised 
as follows. 

• If the amplitude of density perturbations around the box scale 
is small (5 < 1) but not much smaller than unity, simulations un- 
derestimate the correlation function though the number density of 
small mass haloes does not change by much (Gelb & Bertschinger 
1994a,b). In other words, the formation of small haloes is not dis- 
turbed but their distribution is affected by non-inclusion of long 
wave modes. 

• In the same situation, the number density of massive haloes 
drops significantly (Gelb & Bertschinger 1994a,b; Bagla & Ray 
2005). 

• Effects of a finite box size modify values of physical quantities 
like the correlation function even at scales much smaller than the 
simulation volume (Bagla & Ray 2005). 

• The void spectrum is also affected by finite size of the sim- 
ulation volume if perturbations at large scales are not ignorable 
(Kauffmann & Melott 1992). 

• It has been shown that properties of a given halo can change 
significantly as the contribution of perturbations at large scales is 
removed to the initial conditions but the distribution of most inter- 
nal properties remain unchanged (Power & Knebe 2005). 

In some cases, one may be able to devise a method to "cor- 
rect" for the effects of a finite box-size (Colombi et al. 1994), but 
such methods cannot be generalised to all statistical measures or 
physical quantities. 

The effects of perturbations at scales larger than the box size 
can be added using MAP (Mode Adding Procedure) after a simu- 
lation has been run (Tormen & Bertschinger 1996). This method 
makes use of the fact that if the box size is chosen to be large 
enough then the contribution of larger scales can be incorporated 



by adding displacements due to the larger scales independently of 
the evolution of the system in an N-Body simulation. The motiva- 
tion for development of such a tool is to enhance the range of scales 
over which results of an N-Body simulation can be used by im- 
proving the description at scales comparable to the box size. Such 
an approach ignores the coupling of large scale modes with small 
scale modes and this again brings up the issue of what is a large 
enough scale for a given model such that the effects of mode cou- 
pling can be ignored. Large scales contribute to displacements and 
velocities, and variations in density due to these scales modify the 
rate of growth for small scales perturbations (Cole 1997). 

Effects of a finite box size modify values of physical quanti- 
ties even at scales much smaller than the simulation volume (Bagla 
& Ray 2005) (BR05, hereafter). In BR05, we suggested use of the 
fraction of mass in collapsed haloes as an indicator of the effect 
of a finite box size. We found that if the simulation volume is not 
large enough, the fraction of mass in collapsed haloes is underes- 
timated. As the collapsed fraction is less sensitive to box-size as 
compared to measures of clustering, several other statistical indi- 
cators of clustering to deviate significantly from expected values in 
such simulations. A workaround for this problem was suggested in 
the form of an ensemble of simulations to take the effect of con- 
vergence due to long wave modes into account (Sirko 2005), the 
effects of shear due to long wave modes are ignored here. It has 
also been shown that the distribution of most internal properties 
of haloes, e.g., concentration, triaxiality and angular momentum 
do not change considerably with the box size even though proper- 
ties of a given halo may change by a significant amount (Power & 
Knebe 2005). 

There is a clear need to develop a formalism for estimating the 
effect of perturbations at large scales on a variety of physical quan- 
tities. Without such a formalism, we cannot decide in an objective 
manner whether a simulation box size is sufficiently large or not. 
In this paper we generalise the approach suggested in BR05 and 
write down an explicit correction term for a number of statistical 
indicators of clustering. This approach allows us to study generic 
properties of the expected correction term in any given case, apart 
of course from allowing us to evaluate the magnitude of the cor- 
rection as compared to the expected value of the given statistical 
indicator. We apply this technique to mass functions in this paper. 



2 BASIC EQUATIONS 

Initial conditions for N-Body simulations are often taken to be a re- 
alisation of a Gaussian random field with a given power spectrum, 
for details see, e.g., Bagla and Padmanabhan (1997b); Bertschinger 
(1998); Bagla (2005). The power spectrum is sampled at dis- 
crete points in the k space between the scales corresponding to 
the box size (fundamental mode) and the grid size (Nyquist fre- 
quency/mode). Here k is the wave vector. Sampling of the power 
spectrum in initial conditions of N-Body simulations is dense to- 
wards the Nyquist mode, but is sparse for modes near the funda- 
mental mode. Power spectra for density, potential and the velocity 
field are related to each other in the linear regime 1 . The power spec- 

1 Density and Potential are related through the Poisson equation and hence 
the knowledge of power spectrum of one can be used to compute the power 
spectrum for the other quantity. These quantities at late times are obtained 
through evolution in which mode-coupling is significant and hence the ef- 
fects of missing modes are not easy to quantify. The sampling of initial 
conditions is, in our considered view, more relevant and easier to quantify 
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tra can be used to compute the second moment; either two point 
functions or rms fluctuations. In view of the sampling of the power 
spectrum in initial conditions, the second moment can be expressed 
as a sum over power spectrum at these points, weighted by an ap- 
propriate window function. 

In the peak picture, most quantities of interest can be related to 
the two point correlation function (Bardeen et al. 1986), therefore 
a method for estimating box-size correction to the second moment 
can be used as a base for computing correction for other physical 
quantities. 



2.1 Clustering Amplitude 

We now present our approach for estimating the effects of a finite 
box size on physical quantities in the linear limit. We will illustrate 
our approach using rms fluctuations in mass cr(r), but as shown 
below, the basic approach can be generalised to any other quantity 
in a straightforward manner. In general, a(r) may be defined as 
follows: 



2 (r) 



k 2ty 2 



(1) 



Here P(k) is the power spectrum of density contrast, r is the co- 
moving length scale at which rms fluctuations are defined, k — 
yjkl+k'^ + kl is the wave number and W(kr) is the Fourier 
transform of the window function used for sampling the density 
field. The window function may be a Gaussian or a step function 
in real or fc-space. We choose to work with a step function in real 
space where W(kr) — 9 (sin kr — kr cos kr) 2 /(fc 6 r 6 ), see e.g., 
§5.4 of Padmanabhan (1993) for further details. In an N-Body sim- 
ulation, the power spectrum is sampled only at specified points in 
the fc-space. In this case, we may write a (r) as a sum over these 
points. 
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Figure 1. This figure shows the first correction term Ci (see Eqn.(5)) for 
power law models with index n normalised such that cr ( 2 (r n ;) = 1. The 
curves here are for L box /r n i = 16 (dot-dashed curve), L box /r nt = 128 
(dashed curve) and L box /r n [ = 512 (solid curve). C\ is plotted as a func- 
tion of n + 3 and we find that the correction term increases as n + 3 — > 0. 
See text for more details. 



approximate the sum over the k modes sampled in initial conditions 
by an integral. Further, we make use of the fact that small scales 
do not influence large scales to ignore the error contributed by the 
upper limit of the integral. This approximation is valid as long as 
the scales of interest are more than a few grid lengths. 

In the approach outlined above, the value of a 2 at a given scale 
is expressed as a combination of the expected value a 2 and the 
correction due to the finite box size a\. Here a 1 is independent 
of the box size and depends only on the power spectrum and the 
scale of interest. It is clear than o 2 {r, Lbox) < a o{ r ) an d a l so 
a 2 (r, Lbox) > 0. It can also be shown that for hierarchical models, 
da 2 (r, L hox )/dr < 0, i.e., a 2 (r, L box ) increases or saturates to a 
constant value as we approach small r. 

If the scale of interest is much smaller than the box-size L box 

then, 
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Here o-qM i s me expected level of fluctuations in mass at scale r 
for the given power spectrum and a 2 (r, Lbox) is what we get in 
an N-Body simulation at early times. We have assumed that we can 



than the effects of missing mode-coupling terms. Therefore, in our discus- 
sion, we deal with the initial or the linearly evolved power spectra of various 
quantities. 
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Figure 2. The top panel shows curves of constant C\ (Lbox) /oq (r) on the 
r — ^box plane for the ACDM model (see text for details). Lines mark 
Ci/crg = 0.01, 0.03, 0.1, 0.3 and 0.5, from top to bottom. The lower 
panel shows the scale of non-linearity r n i as a function of redshift for the 
ACDM model. 



The small parameter in the expansion is r/Lb ox - This expansion is 
useful if k P(k) goes to zero as we approach k = 0. It is interest- 
ing to note that the first term is scale independent. The numerical 
values of d can be used to estimate the scale below which ai can 
be approximated by a constant. Later terms are scale dependent and 
the noteworthy feature is that modes closer to 2n / Lbox contribute 
more significantly to the integral for most models. 

It is noteworthy that the first term, Ci, has the same value 
for all choices of window functions that approach unity at small k. 
By virtue of this fact, Ci is also the correction for the two point 
correlation function £(r) at sufficiently small scales. 

At large scales <7y(r) and a\ (r, L box ) have a similar magni- 
tude and the rms fluctuations in the simulation become negligible 



Table 1. This table lists corrections due to a finite box-size to indicators 
of clustering in the limit r <C -Lbox- These expressions are equivalent to 
Eqn.(4) and constants d are the same as in that equation. 



Indicator 


Correction 






Ci - |C 2 r 2 + §§C 3 r 4 


+ C>(r 6 ) 


CM 


Ci - \C-2r 2 + ^C 3 r 4 


+ C(r 6 ) 



compared to the expected values in the model. As we approach 
small r the correction term crf(r, Lbox) is constant and for most 
models it becomes insignificant in comparison with a%(r). In mod- 
els where o"o( r ) increases very slowly at small scales or saturates 
to a constant value, the correction term erf can be significant at all 
scales. This can be seen from the expression for Ci for power law 
models (P(k) = Ak n ) 

Ci _ i A (JM n+a (5) 

n + 3 2ty 2 V Lbox / 

Clearly, this term becomes more and more significant as n — > —3. 
Figure 1 illustrates this point, here Ci is shown as a function of 
n + 3. We fix A by choosing a scale of non-linearity r n i such that 
o"o(Vni) = 1. Curves are plotted for three values of Lbox/r n ;: 
LboxAni = 16 (dot-dashed curve), L box /r n i = 128 (dashed 
curve) and L bo ^/r n i = 512 (solid curve). As <tq is unity at the 
scale of non-linearity and Ci is the first order correction, clearly we 
require Ci <C 1 for the error due to box size to be small and hence 
ignorable. If we fix Ci < 0.1 then we can simulate n = — 1 with 
Lbox/r„; = 16 but for more negative indices we require a larger 
separation between the box size and the scale of non-linearity. We 
can just about manage n = —2.3 with L bo ^/r n i — 128 with the 
same threshold on error, and with Lbox/j"n; = 512 we can go up 
ton = —2.5. As N-Body simulations are most useful for studying 
non-linear evolution, even the largest simulations possible today are 
left with a small range of scales over which cro > 1 for n < —2.0. 
This shows the pitfalls of simulating models with n ~ — 3 over the 
entire range of scales. 

Figure 2 (top panel) shows lines of constant C\ja% in the 
Lbox — r plane for the ACDM. We chose n = 1, h = 0.7, 
Qa = 0.7, Q„ r = 0.3 and as = 0.9. We ignored the effects of 
Baryons on the power spectrum. From top to bottom, the lines cor- 
respond to Ci/o-q = 0.01, 0.03, 0.1, 0.3 and 0.5. It is noteworthy 
that a box-size smaller than 0.5Mpc is precluded if we insist on 
C'i(Lbox)/o"o( r ) ^ 0.1, irrespective of the scale of interest. This 
implies that we cannot expect to simulate scales smaller than about 
0.5 kpc in the ACDM model without considerable improvement in 
the dynamic range of cosmological N-Body simulations. As we are 
using linearly evolved quantities for our argument, the comments 
on box size are valid irrespective of the redshift up to which the 
simulation is run. The contours do not change if we use a\ instead 
ofCi. 

The lower panel of the same figure shows the scale of non- 
linearity for the ACDM model as a function of redshift. 

This formalism can be used to estimate corrections for other 
estimators of clustering. For reference, we have given expressions 
equivalent to Eqn.(4) for the correction to £ and f in Table 1. 
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2.2 Velocities 

We can use the method outlined above to estimate finite box correc- 
tions to the velocity field. Velocities and density contrast are related 
to one another (Peebles 1980) in the linear regime. The power spec- 
tra for these two are related as P v (k) oc P(k)/k 2 . Thus bulk ve- 
locities at any given scale get a larger contribution from the power 
spectrum at large scales (small k) than density fluctuations. This 
implies that the correction term must be more significant for veloc- 
ities than the equivalent correction for the clustering amplitude. We 
will discuss the corrections in velocity field in detail in a follow up 
paper. 



2.3 Mass Function 

We can use the explicit correction for rms fluctuations (a) to esti- 
mate the correction for mass functions of haloes. We use the Press- 
Schechter approach (Press & Schechter 1974; Bond et al. 1991), 
but we also give results for the Sheth-Tormen mass function (Sheth 
& Tormen 1999; Sheth, Mo, & Tormen 2001) in order to demon- 
strate that our results are generic in nature. 

The mass fraction in collapsed haloes with mass greater than 
M is given in the Press-Schechter model by 
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Where S c (— 1.68 for Einstein-de Sitter cosmology) is a parame- 
ter 2 and M is related to the scale r through the usual relation. We 
can write F as the contribution expected in the limit L b ox > 
and a correction due to the finite box size. 
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The correction to F(M) due to the finite box size always leads 
to an under-estimate as Fi(M, L box ) is always positive. This is 
consistent with what we found in BR05. However, Fi(M, L box ) is 
not a monotonic function of M as it goes to zero at small as well as 
large M. At small M (M < M n if, the limits of the integral differ 
by a very small amount. This difference (<5 c a 2 /2\/2ao) keeps on 
decreasing as we get to small M while the integrand remains finite. 
Therefore we expect Fi to decrease at small M. At these scales, 
we can write an approximate expression for Fi(M): 



Fl(M) : 



2tt 01 



— exp 



2a 2 



(8) 



2 In the spherical collapse model, this is the linearly extrapolated density 
contrast at which we expect the halo to virialise (Gunn & Gott 1972). 



M n i is the mass corresponding to the scale where cto 
assume that L box is much larger than this scale. 
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Figure 3. The Press-Schechter mass function and correction terms are plot- 
ted as a function of mass. Fq(M) (solid curve), F(M) (dashed curve) and 
Fi(M) (dot-dashed curve) are shown here. The scale where ao = 6 C /V3 
is marked with a vertical dotted line, we see that this estimate coincides 
with the maximum of F\(M). The correction term Fi(M) is more than 
10% of Fo(M) at this scale. Also shown is the approximate expression 
Eqn.(8) for F\{M) (dot-dot-dot-dashed curve) and we note that it follows 
the actual curve to masses greater than M nl . Mass here is plotted in units 
of mass of each particle and we assumed that the scale of non-linearity is 8 
grid lengths. 



This clearly decreases as we go to small M: G\ goes over to the 
constant Ci and ao keeps increasing. 

At large M (M > M nt ), both a(M,L box ) and a (M) are 
small and the limits of the integral cover the region where the in- 
tegrand is very small. Thus we expect F\(M, L box ) to become 
smaller as we go to larger M in this regime. At these scales, we 
also expect _Fo and F\ to become almost equal while F(M) goes 
to zero faster than either term. Therefore F\(M, L box ) must have 
a maxima at an intermediate scale. The scale at which the maxima 
occurs can be found by solving the following equation. 



d log a 2 
dlog a 2 



(To \ af } 
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exp 
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(9) 



Here, the second equation is obtained if a\ <g; ao. If I/ box 
r n i, where r n i is the scale of non-linearity then a\ is very well 
approximated by the Taylor series Eqn.(4) around this scale and a\ 
is a very slowly varying function of scale. Thus Fi(M, L box ) has 
a maxima at ao = 5 2 /3 ~ 1 if the first term in Eqn.(4) is a good 
approximation for o\. If scale dependent terms in Eqn.(4) are not 
ignorable then the maxima of F\ (M, L box ) shifts to smaller scales 
(larger ao) in a manner that depends on the power spectrum and 
box-size L box . 

Figure 3 shows the Press-Schechter mass function F(M) for 
a power law model with n = —2, L box /r n ; = 16. We have plot- 
ted Fo{M) (solid curve), F(M) (dashed curve) and Fi(M) (dot- 
dashed curve) as a function of M. The scale where ao = 5 C / y/3 is 
marked with a vertical dotted line, we see that this estimate is close 
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Figure 4. The Press-Schechter multiplicity function and correction terms 
are plotted as a function of mass. fo(M) (solid curve), f(M) (dashed 
curve) and fi(M) (dot-dashed curve) are shown here. The scale where 
cro = <5c/v3 is marked with a vertical dotted line, we see that this estimate 
coincides with change of sign for /i(M). At scales below this, the correc- 
tion term fi (M) is positive and hence there are more haloes in simulation 
than expected in the model. Also shown is the approximate expression for 
/l(M) (dot-dot-dot-dashed curve). Mass here is plotted in units of mass 
of each particle and we assumed that the scale of non-linearity is 8 grid 
lengths. 



to the maximum of F\(M). The correction term F\{M) is more 
than 10% of Fo(M) at this scale. Also shown is the approximate 
expression Eqn.(8) for Fi(M) (dot-dot-dot-dashed curve) and we 
note that it follows the actual curve to masses greater than M„(. 
This figure illustrates all the generic features of corrections to mass 
function that we have discussed above. 

The multiplicity function / is often defined as the fraction of 
mass in a logarithmic interval in mass: 



f(M, L box )dlogM 
f(M, L box ) 



-dlogM 



0F(M,L bo 
dlogM 
dFp(M) aFi(M,L b „x) 
dlogM dlogM 
/o(M)-/ 1 (M,L box ). 



(10) 



It is not possible to reduce this expression further while writ- 
ing the correction term due to the finite box size separately. We 
can, however, ascertain generic properties of the correction term 
fx (M, Lbox) from our understanding of Fi (M, L box )- At large M, 
fi is positive as Fi(M, Lbox) decreases with increasing M. Thus 
the mass fraction of haloes in this mass range is underestimated in 
simulations. For typical models and simulations, this is the most 
significant effect of a finite box size. 

We know that /i has a zero near the scale of non-linearity as 
Fi has a maxima here. Thus there is a scale where corrections for 
the multiplicity function due to a finite box size vanish. At smaller 
scales, the slope of Fi and hence /i changes sign and the correction 
to mass fraction in haloes is positive. A finite box size leads to an 
over estimate of number of low mass haloes. This over estimate is 



caused by absence of long wave modes, as the low mass haloes do 
not merge to form the high mass haloes. 

The magnitude of over estimate depends on cri, and hence 
on the slope of the power spectrum and Lbox- In the limit of 
M <C M„i, we can use Eqn.(8) to compute the magnitude of over 
estimate: 



/(M) ~ MM) + 



3S C Ci 



dlogM 



(11) 



Here we have ignored the contribution of the exponential term in 
Eqn.(8). The correction term scales as M ( - n+i ^ 2 for power law 
models, thus it is significant even at small mass scales if n ~ —3. 
Clearly, the term is also large for CDM like power spectra if the 
slope of the power spectrum is close to —3 at all scales in the sim- 
ulation volume. 

Figure 4 shows the Press-Schechter multiplicity function and 
correction terms as a function of mass for the model used in Fig- 
ure 3 (Power law model with n — —2, LboxA'ni = 16.). The 
expected multiplicity function fo(M) (solid curve), what is ex- 
pected in the simulation f(M) (dashed curve) and the correction 
term fi(M) (dot-dashed curve) are shown here. The scale where 
o"o = dc/VS is marked with a vertical dotted line, we see that this 
almost coincides with change of sign for /i(M) 4 . At scales be- 
low this, the correction term fi(M) is positive and hence there are 
more haloes in the simulation than expected in the model. The rel- 
ative magnitude of the correction term is large for M > M n i and 
this is the most significant effect of a finite box-size on the mass 
function. The over estimate of the multiplicity function is typically 
a sub-dominant effect, as it is for the model shown here. However, 
as we shall see below, this effect can be very significant in some sit- 
uations. Also shown in the figure is the approximate expression for 
/i(M) (dot-dot-dot-dashed curve) in the limit M < M ni . Unlike 
the approximation for F\ (M) which is accurate over a large range 
of scales, this is expected to be valid only in the limit of M <C M ni 
and indeed, is off by about a factor of two at the smallest scales 
shown here. However, it is a good approximation if we go to even 
smaller masses. We note that for this model, the over estimate of 
multiplicity function due to the finite box is small and therefore is 
difficult to detect. For this model, Ci /ctq ~ 0.2 at the scale of non- 
linearity and is smaller than 0.1 at scales where the over estimate in 
f(M) is maximum. At small scales, fi/fo is typically of the same 
order of magnitude as Ci/o'^. 

2.3.1 Sheth- Tormen Mass Function 

We now give corresponding formulae for the Sheth-Tormen mass 
function (Sheth & Tormen 1999; Sheth, Mo, & Tormen 2001). The 
definition of mass function (Eqn.6) is modified to: 



F(M,L b ox) = 



A(l+x-' 2q )cxp [-x 2 ] dx(l2) 



6 c /a(M,lbx)V2 



In the limit of A — 0.5 and q — this is identical to the usual 
Press-Schechter mass function (Eqn.6). The maxima of the correc- 
tion term (Li(M, L box )) occurs when the following equation is 
satisfied: 



d log of 
dlog a\ 













M) 


cro 





4 The change of sign happens at cro 
with5 c = 1.68. 



1 instead of cro = 0.97 drawn here 
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Figure 5. Shown here is the number density of haloes n(M)dM in the 
mass range M — M + dM for these two simulations. The solid line shows 
the number density of haloes in the reference simulation (L box = 256). 
Number density of haloes in the simulation with Lbox = 64Mpc is shown 
by the dashed line. 



exp 



d c a 1 



2a^a 2 



1 + 



V2a 



-2q 



1 + 



\ 2n 



-2q 



V2< 



<T() 



-2q 



(13) 



As before, this reduces to the expression in the Press-Schechter 
case (Eqn.9) in the limit q = 0. The qualitative features of the 
finite box correction to mass function are the same for the two pre- 
scriptions and may be considered to be generic. For reference, we 
write approximate expressions for correction to the mass function 
F(M): 



2tt cr t 



— exp 









f s c y 2 «~ 




A 


1 + 


(14) 


2a 2 _ 


\ V2a J 



and the multiplicity function f(M): 
3(5 c C\ ( dao 



f 



A 



1 + 



(-1) 



v/2. 



-2q 



(15) 



2tt &o \dlogM 
for the Sheth-Tormen mass function. 

2.3.2 N-Body Simulations 

We present here some preliminary results of a comparison of our 
results with N-Body simulations. We do not try to fit either the 
Press-Schechter or the Sheth-Tormen mass functions to simulations 
here, instead we use a simulation with a larger L box as reference 
and compare the number density of haloes as a function of mass 
with another simulation run using a smaller Lbox- More detailed 
results obtained from N-Body simulations will be presented in a 
later publication. 



We simulated the n — — 2 power law model in an Einstein- 
de Sitter universe with the normalisation r n i = 8Mpc at the final 
epoch. We chose one grid length of the simulation to equal IMpc. 
The simulation was run with two values of the box-size: Lbox = 
64Mpc and Lbox — 256Mpc, with the latter being the reference. 
The simulations were run using the TreePM method (Bagla 2002; 
Bagla & Ray 2003). The parallel TreePM code was used for the 
256 3 simulation (Ray & Bagla 2004). 

Figure 5 shows the number density of haloes n(M)dM in the 
mass range M — M + dM for these two simulations. Note that 
following our definitions n(M) = f(M) /M 2 , where f(M) is the 
multiplicity function. The solid line shows the number density of 
haloes in the reference simulation. One can see the approximately 
power law variation at small M and a rapid fall off at large M. 
Number density of haloes in the simulation with Lb ox = 64Mpc is 
shown by the dashed line. As expected from the above discussion, 
the deviation from power law starts at smaller masses as the number 
density of very massive haloes is under-estimated as compared to 
the reference simulation. At smaller M, we find about 10% more 
haloes in this simulation as compared to the reference. It is note- 
worthy that the number density of low mass haloes remains above 
that in the reference simulation at all masses below the rapid drop 
around 10 2 Mq. Both the features follow the predictions in the pro- 
ceeding discussion, indeed we have shown that these features are 
independent of the specific analytical form for the mass function. 
Here we have also shown that the same behaviour is reproduced 
in N-Body simulations. A more detailed comparison is beyond the 
scope of this paper and the results will be presented in a later pub- 
lication. 



3 DISCUSSION 

In the preceeding sections, we have described a method to estimate 
errors in the descriptors of clustering in the linear regime. We have 
also shown that the key results of the analytical study are borne out 
by N-Body simulations. We have shown that the error is typically 
small if the scale of interest is sufficiently smaller than the box size. 
An implicit requirement is that the scale of non-linearity too should 
be much smaller than the box size; if this restriction is overlooked 
then we not only ignore power in modes larger than the simulation 
box but also the significant effects of mode coupling from scales 
in the mildly non-linear regime. Therefore, we require r, r„i <g 

Lbox- 

We propose using a\ / a% as an indicator of the significance of 
the finite box size, any descriptor of second moment can be used but 
a has the virtue of being positive definite at all scales. Our proposal 
is that a 2 (r)/a 2 (r), a 2 (r n i(z))/a 2 (r n i(z)) < 1, for the finite 
box-size corrections to be ignorable. All the as are evolved linearly 
here. Conversely, the ratio a\ /ao at the scale of interest is indica- 
tive of the magnitude of correction due to the finite box-size. For a 
given relative magnitude of the correction term (ai/ao), Lbox/r„; 
is required to be larger for spectra with more large scale power. In- 
deed, the required L^ ox /r n i approaches infinity as the slope of the 
power spectrum approaches —3. 

As a result of finite box-size corrections, the amplitude of den- 
sity perturbations is not a power law and the range of scales over 
which it can be approximated by one becomes smaller as we ap- 
proach n + 3 — * 0. In the linear regime, the radial pair velocity 
is related directly with £ (Peebles 1980; Nityananda & Padmanab- 
han 1994). As f is not a pure power law in simulations due to box- 
size corrections, we expect that the pair velocities must also deviate 
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Figure 6. The multiplicity function expected in the ACDM model (see text for details). The top row is for Press-Schechter mass function and the lower row 
is for Shefh-Tormen mass function. The left column is for z = 20 and the right column is for z = 15. The expected multiplicity function is plotted as a 
function of mass, shown in each panel by a solid curve. Other curves correspond to multiplicity for a finite simulation box: L box = 5h _1 kpc (dotted curve), 
^box = 20h~ 1 kpc (dot-dashed curve) and L box = 100h~ 1 kpc (dashed curve). These correspond to C\/a^ ~ 0.6, 0.3 and 0.19, respectively. 



from expected values. This, in turn leads to deviations from scale 
invariant growth of density perturbations. This explains the diffi- 
culty in getting scale invariant evolution for models like n = —2 in 
N-Body simulations (Jain & Bertschinger 1996, 1998). For realis- 
tic models like the ACDM, the correction term is significant only if 
the scales of interest are below a few kpc and becomes larger as we 
move to smaller scales (see Figure 2). Indeed, at these small scales 
we may require L box /r ~ 10 4 or even greater in order to manage 
Ci/ctq = 0.1. Of course, a bigger simulation volume is required if 
we demand better accuracy. On the other hand, if we are interested 
in scales larger than 10 2 kpc, present day simulations are sufficient 
for keeping Ci/ct 2 < 0.01. 

We have shown that at sufficiently small scales the correction 
due to a finite box size can be written as a series of progressively 



smaller terms. The first correction term (Ci) is shown to be positive 
definite. We have also shown that the first correction term is the 
same for two point correlation function and a' 2 , indeed it is the same 
for all descriptors of the second moment for which the effective 
window function goes to unity at small k. 

As an application of our method, we discussed the correc- 
tion to mass function and multiplicity function using the Press- 
Schechter as well as the Sheth-Tormen approach. We have given 
the explicit form of the correction term due to finite box size in 
each case. We have also given approximate expressions for the cor- 
rection term and have shown that the approximation is very good in 
case of mass function. The mass function is always under estimated 
in simulations due to finite box-size corrections. Multiplicity func- 
tion, and hence also the number density of haloes of a given mass 



Effects of the size of cosmological N-Body simulations on physical quantities 9 



are underestimated at M > M n i. At smaller mass scales, however, 
the multiplicity function is over estimated and we find more haloes 
in a simulation than expected in the model. The mass scale at which 
the cross over from under estimate to over estimate occurs is given 
by Eqn.(lO) for Press-Schechter and Eqn.(14) for Sheth-Tormen 
mass function. 

The over estimate at small scales is related to the under esti- 
mate of mass in haloes at large scales. If the full power spectrum 
had been taken into account, the smaller haloes would have merged 
to form more massive haloes. In absence of large scale modes, the 
formation of massive haloes is slowed down and a larger number of 
low mass haloes survive. A detailed analysis of the effect of finite 
box size correction on merger rates for haloes will be presented in 
a forthcoming paper. Of significant interest is the impact on rates 
of major mergers (Cohn, Bagla and White 2001) as these have im- 
plications for observations. 

We find that the over estimate in multiplicity function is 
large whenever the ratio af(r, L box )/ao(r) ~ Ci(L bo x) A>o(r) 
is large. To illustrate this correlation, we have plotted the multi- 
plicity function fo(M) for the ACDM model in Figure 6. This 
has been plotted for redshift z = 20 and z = 15 and the 
mass range has been chosen such that very large box size is re- 
quired to keep (Ti(r, Lbox)/co( r ) smaller than 0.1. We have also 
plotted /(M, L b ox) here, with L box = 5h~ x kpc (dotted curve), 
Lbox = 20h~ 1 kpc (dot-dashed curve) and L box = 100h _1 kpc 
(dashed curve). These correspond to Ci/oq — 0.6, 0.3 and 0.19, 
respectively. The top row is for the Press-Schechter mass function 
and the lower row is for the Sheth-Tormen mass function. An iden- 
tical x — y range has been used to highlight the differences between 
the two models for mass function as well. It is noteworthy that the 
relative error is similar in both the cases even though the multiplic- 
ity function itself is different. At z = 20, the multiplicity func- 
tion is under estimated by a large amount for Lb ox = 5h _1 kpc, 
even though Lbox/V„! ~ 120 and if we are interested in scales 
around 1 pc then Lbox A — 5000. The situation at small masses 
is better for the other two simulation volumes considered here. For 
z = 15, the scale of non-linearity is r n i — 1.4h~ kpc, very close 
to Lbox = 5h _1 kpc and hence we do not expect believable results 
for this box-size. Indeed, the two panels on the right demonstrate 
the large errors and the absurdly incorrect shape of the multiplicity 
function. The difference in f(M) and /o(M) at 10~ 6 M© is about 
25% for Ci(Lbox)/o"o = 0.3, in this case L box = 20h _1 kpc 
and Lbox/r — 2 x 10 4 . The error in the multiplicity function 
is slightly larger than 10% for Lbox = 100h _1 kpc even though 
Lbox/r — 10 5 and Lb ox /r„i ~ 67. The multiplicity function plot- 
ted here is the global function, and the conditional mass functions 
should be used in order to estimate errors in simulations where a 
high peak is studied at better resolution. Similar results are obtained 
for other mass functions that have been suggested as a better fit to 
simulation data (Jenkins et al. 2001; Warren et al. 2005). 

The above discussion demonstrates the perils of using simu- 
lations where Ci(Lbox)/o"o(r) is close to unity. One may argue 
that models for mass function have not been tested in this regime 
where the local slope of the power spectrum is very close to —3, 
but the fact that error in amplitude of density perturbations itself is 
large should be reason enough to worry about reliability of results. 
Further, the agreement in the magnitude of errors for the several 
approaches to mass functions also gives us some confidence in re- 
sults. 

Majority of simulations are not affected by such serious errors, 
as the slope of power spectrum approaches —3 only at very small 
scales (large wave numbers). However, high resolution simulations 



of earliest structure formation in the ACDM model need to have a 
very large dynamic range before the results can be believed within 
10% of the quoted value. Indeed, our work may have some rele- 
vance to the ongoing discussion about the Earth mass haloes (Die- 
mand, Moore, & Stadel 2005; Zhao et al. 2005; Zentner, Koushiap- 
pas, & Kazantzidis 2005; Moore et al. 2005; Zhao et al. 2005). 



4 CONCLUSIONS 

Conclusions of this work may be summarised as follows. 

• We have presented a formalism that can be used to estimate 
the deviations of cosmological N-Body simulations from the mod- 
els being simulated due to the use of a finite box size. These devi- 
ations/errors are independent of the specific method used for doing 
simulations. 

• For a given model, the deviations can be expressed as a func- 
tion of the scale r of interest and Lbox, the box size of simulations. 

• We have applied the formalism to study deviations in rms fluc- 
tuations in mass in the initial conditions. 

• We find that the errors are small except for models where the 
slope of the power spectrum is close to —3 at scales of interest. 

• The errors in case of the ACDM model are significant if the 
scale of interest is smaller than a kpc even if simulations as large as 
the Millennium simulation (Springel et al. 2005) are used. 

• We have studied errors in mass function in the Press-Schechter 
model, as well as other models. 

• The main error due to a finite box size is that the number of 
high mass haloes is under estimated. 

• The number of low mass haloes is over predicted in simula- 
tions if the box size effects are important. This happens as low mass 
haloes do not merge to form the (missing) high mass haloes. 

• We have verified these trends using N-Body simulations. 

We note that it is extremely important to understand the 
sources of errors in N-Body simulations and the magnitude of er- 
rors in quantities of physical interest. N-Body simulations are used 
to make predictions for a number of observational projects and also 
serve as a test bed for methods. In this era of "precision cosmol- 
ogy", it will be tragic if simulations prove to be a weak link. We 
would like to note that our results apply equally to all methods of 
doing cosmological N-Body simulations, save those where tech- 
niques like MAP are used to include the effects of scales larger 
than the simulation volume. 

The method for estimating errors due to a finite box-size de- 
scribed in this paper can be used for several physical quantities. In 
this paper we have used the method to study errors in clustering 
properties and mass functions. We are studying the effect of finite 
box size on velocity fields and related quantities, the results will be 
presented in a later publication. 
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